
Prevent stale queue snapshots from regressing workflow completion state #9043

Merged
lstein merged 28 commits into invoke-ai:main from JPPhoto:queue-item-status-sequence
Apr 23, 2026

@JPPhoto JPPhoto commented Apr 10, 2026

Summary

Fix a race where fast workflow runs could stay visible as working after the queue had already drained.

This PR adds a per-item status_sequence to queue items and queue_item_status_changed events, persists it in invokeai/app/services/shared/sqlite_migrator/migrations/migration_28.py, and uses it in invokeai/frontend/web/src/features/controlLayers/components/StagingArea/state.ts to ignore stale queue snapshots that arrive after newer status updates. It also updates invokeai/frontend/web/src/services/events/setEventListeners.tsx so queue caches carry the new sequence immediately from socket events.
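
As a rough illustration of the idea (not the code in state.ts), a client can keep the highest status_sequence it has seen per item and drop any snapshot that carries a lower one. Everything below except the status_sequence field name is a hypothetical stand-in: the LatestStatusTracker class, the item_id field, and the status values are assumptions made for the sketch.

```typescript
// Hypothetical shapes for illustration only; not the real queue item schema.
type QueueItemStatus = 'pending' | 'in_progress' | 'completed' | 'failed' | 'canceled';

interface QueueItemSnapshot {
  item_id: number;
  status: QueueItemStatus;
  status_sequence: number;
}

// Keeps the newest snapshot per item and rejects anything older.
class LatestStatusTracker {
  private latest = new Map<number, QueueItemSnapshot>();

  /** Returns true if the snapshot was applied, false if it was stale. */
  apply(snapshot: QueueItemSnapshot): boolean {
    const current = this.latest.get(snapshot.item_id);
    if (current !== undefined && snapshot.status_sequence < current.status_sequence) {
      return false; // older than what we already hold: ignore it
    }
    this.latest.set(snapshot.item_id, snapshot);
    return true;
  }

  statusOf(itemId: number): QueueItemStatus | undefined {
    return this.latest.get(itemId)?.status;
  }
}

// A socket event reports completion first; a slower list refetch arrives afterwards
// still carrying the earlier in_progress state.
const tracker = new LatestStatusTracker();
const appliedFresh = tracker.apply({ item_id: 1, status: 'completed', status_sequence: 2 });
const appliedStale = tracker.apply({ item_id: 1, status: 'in_progress', status_sequence: 1 });
```

In this sketch the late in_progress snapshot is rejected, so the item stays completed instead of regressing to a working state.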

Related Issues / Discussions

QA Instructions

  1. Back up your database or run with an in-memory database; this PR involves a DB migration.

  2. Build the frontend as normal.

  3. Run backend tests:
    pytest tests/test_session_queue.py tests/app/services/session_queue/test_session_queue_clear.py tests/app/services/session_queue/test_session_queue_status_sequence.py tests/app/routers/test_session_queue_sanitization.py tests/test_sqlite_migrator.py

  4. Run the staging area frontend tests:
    cd invokeai/frontend/web && ./node_modules/.bin/vitest run src/features/controlLayers/components/StagingArea/state.test.ts

  5. Try reproducing manually by running a very fast workflow (with one Integer Primitive node, for example) and confirming the staging area does not remain stuck in a pending or working state after completion.

Merge Plan

This touches queue event payloads and adds a DB migration in invokeai/app/services/shared/sqlite_migrator/migrations/migration_28.py, so it should be rebased cleanly before merge.

Checklist

  • The PR has a short but descriptive title, suitable for a changelog
  • Tests added / updated (if applicable)
  • ❗Changes to a redux slice have a corresponding migration
  • Documentation added / updated (if applicable)
  • Updated What's New copy (if doing a release after this PR)

@github-actions bot added the python, services, frontend, and python-tests labels Apr 10, 2026
@JPPhoto added the backend and frontend labels and removed the frontend label Apr 10, 2026
@JPPhoto force-pushed the queue-item-status-sequence branch from a510262 to 868a411 April 12, 2026 14:21
@JPPhoto force-pushed the queue-item-status-sequence branch from 868a411 to 9134a38 April 13, 2026 23:50
@lstein self-assigned this Apr 14, 2026
JPPhoto and others added 9 commits April 14, 2026 11:23
…ches/306684-1776184069/3153380b270b192d54042e0c70ca38e7f897a2cf
…ches/318403-1776197224/3153380b270b192d54042e0c70ca38e7f897a2cf
…ches/558-1776426442/3153380b270b192d54042e0c70ca38e7f897a2cf
…ches/4860-1776428487/3153380b270b192d54042e0c70ca38e7f897a2cf
…ches/30802-1776472946/3153380b270b192d54042e0c70ca38e7f897a2cf
…ches/40315-1776547892/3153380b270b192d54042e0c70ca38e7f897a2cf
…ches/44449-1776563122/3153380b270b192d54042e0c70ca38e7f897a2cf
JPPhoto and others added 9 commits April 19, 2026 18:37
…ches/61407-1776641836/3153380b270b192d54042e0c70ca38e7f897a2cf
…ches/64843-1776644718/3153380b270b192d54042e0c70ca38e7f897a2cf
…ches/79583-1776705700/3153380b270b192d54042e0c70ca38e7f897a2cf
…ches/93876-1776717315/3153380b270b192d54042e0c70ca38e7f897a2cf
…ches/99484-1776718892/3153380b270b192d54042e0c70ca38e7f897a2cf
…ches/103059-1776723590/3153380b270b192d54042e0c70ca38e7f897a2cf
…ches/119808-1776731940/3153380b270b192d54042e0c70ca38e7f897a2cf
…ches/125326-1776735930/3153380b270b192d54042e0c70ca38e7f897a2cf
JPPhoto added 2 commits April 20, 2026 20:59
…ches/127970-1776736741/3153380b270b192d54042e0c70ca38e7f897a2cf
…ches/175532-1776819348/3153380b270b192d54042e0c70ca38e7f897a2cf
@lstein lstein left a comment

@JPPhoto The last commit fixed the "stuck" issue I'd found in fast-running workflows, and I've approved this PR.

I did run a code review with Claude Code and it had a few very minor suggestions. They seem reasonable, but I'll leave it up to you whether you wish to implement them before the merge. I'll check in again tomorrow to see what you prefer.

Blocking issues: None.

Non-blocking suggestions:

  1. session_queue_common.py:222 — status_sequence Pydantic default is None but DB default is 0; a one-line comment explaining the fallback-for-pre-migration
    intent would help future readers.
  2. state.ts:152 — _pruneSeenItemOrdering intentionally drops tracking for vanished items; a comment noting this is the eviction trade-off (vs. unbounded
    memory growth) would clarify intent.
  3. migration_30.py has no dedicated test like test_migration_27_creates_users_table. The generic idempotency test covers the framework but not the backfill
    path. Low risk, but a focused test would strengthen guarantees.
  4. test_session_queue_status_sequence.py:76 — existing cancel test uses never-dequeued items (seq goes to 1). A complementary dequeue-then-cancel case
    (verifying 1 → 2) would prove the counter continues rather than resets. The lifecycle test already exercises this path, so it's a mild gap.
  5. _shouldAcceptQueueItem equal-sequence tiebreaker falls through to rank comparison; worth a comment calling out that terminal-vs-terminal at same sequence
    will accept the later arrival (edge case that shouldn't occur but is handled sensibly).
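
To make suggestions 2 and 5 concrete, here is a minimal sketch of how an accept check with a rank fallback and a pruning step could fit together. It is not the code in state.ts: shouldAccept, record, pruneSeen, the two maps, and the status ranking are hypothetical names and assumptions chosen for illustration, loosely modeled on the identifiers mentioned above.

```typescript
type Status = 'pending' | 'in_progress' | 'completed' | 'failed' | 'canceled';

// Assumed ranking: every terminal status shares the highest rank.
const STATUS_RANK: Record<Status, number> = {
  pending: 0,
  in_progress: 1,
  completed: 2,
  failed: 2,
  canceled: 2,
};

interface Incoming {
  item_id: number;
  status: Status;
  status_sequence?: number; // may be absent on older payloads
}

const seenSequences = new Map<number, number>();
const seenRanks = new Map<number, number>();

function shouldAccept(item: Incoming): boolean {
  const seenSeq = seenSequences.get(item.item_id);
  if (item.status_sequence !== undefined && seenSeq !== undefined && item.status_sequence !== seenSeq) {
    return item.status_sequence > seenSeq; // sequences differ: they decide
  }
  // Missing or equal sequence: fall through to the rank comparison.
  // Terminal-vs-terminal at the same sequence ties on rank, so the later arrival wins.
  const seenRank = seenRanks.get(item.item_id);
  return seenRank === undefined || STATUS_RANK[item.status] >= seenRank;
}

function record(item: Incoming): void {
  const seenSeq = seenSequences.get(item.item_id);
  const rank = STATUS_RANK[item.status];
  if (item.status_sequence !== undefined && (seenSeq === undefined || item.status_sequence > seenSeq)) {
    seenSequences.set(item.item_id, item.status_sequence);
    seenRanks.set(item.item_id, rank);
    return;
  }
  if (item.status_sequence !== undefined && seenSeq !== undefined && item.status_sequence < seenSeq) {
    return; // stale: leave tracking untouched
  }
  seenRanks.set(item.item_id, Math.max(seenRanks.get(item.item_id) ?? -1, rank));
}

// Eviction trade-off: tracking for vanished items is dropped so the maps stay bounded,
// at the cost of forgetting ordering for any item that later reappears.
function pruneSeen(liveItemIds: Set<number>): void {
  for (const id of Array.from(seenSequences.keys())) {
    if (!liveItemIds.has(id)) { seenSequences.delete(id); }
  }
  for (const id of Array.from(seenRanks.keys())) {
    if (!liveItemIds.has(id)) { seenRanks.delete(id); }
  }
}

record({ item_id: 1, status: 'completed', status_sequence: 2 });
const staleAccepted = shouldAccept({ item_id: 1, status: 'in_progress', status_sequence: 1 }); // false
const tieAccepted = shouldAccept({ item_id: 1, status: 'failed', status_sequence: 2 }); // true
const unsequencedAccepted = shouldAccept({ item_id: 1, status: 'in_progress' }); // false
pruneSeen(new Set<number>());
const afterPruneAccepted = shouldAccept({ item_id: 1, status: 'in_progress', status_sequence: 1 }); // true
```

The last line shows the trade-off from suggestion 2: once tracking for an item is pruned, a snapshot that would have been rejected is accepted again.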

Positive notes:

  • COALESCE(status_sequence, 0) + 1 is atomic and correctly handles both NULL and 0 in one expression.
  • Dual-track ordering (_seenItemStatusSequences + _seenItemStatusRanks fallback) cleanly handles missing-sequence events.
  • ?? undefined (not || undefined) correctly preserves status_sequence = 0.
  • Event handler records ordering before early-returns, so skipped events still advance the counter.
  • Schema.ts asymmetry (event field required, queue-item field optional) matches backend Pydantic correctly and appears properly auto-generated.
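
Two of the notes above come down to how a zero or missing sequence is treated. The sketch below is illustrative only: RawEvent, withNullish, withOr, and nextSequence are hypothetical helpers, and nextSequence merely mirrors the arithmetic of COALESCE(status_sequence, 0) + 1 in TypeScript rather than reproducing the SQL.

```typescript
interface RawEvent {
  status_sequence?: number | null;
}

// `??` only replaces null and undefined, so a legitimate sequence of 0 survives.
const withNullish = (e: RawEvent): number | undefined => e.status_sequence ?? undefined;

// `||` replaces every falsy value, so a sequence of 0 would be lost.
const withOr = (e: RawEvent): number | undefined => e.status_sequence || undefined;

// Same arithmetic as COALESCE(status_sequence, 0) + 1: NULL and 0 both advance to 1.
const nextSequence = (current: number | null | undefined): number => (current ?? 0) + 1;

const keptZero = withNullish({ status_sequence: 0 }); // 0
const lostZero = withOr({ status_sequence: 0 }); // undefined
const fromNull = nextSequence(null); // 1
const fromZero = nextSequence(0); // 1
const fromOne = nextSequence(1); // 2, the dequeue-then-cancel step described in suggestion 4
```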


JPPhoto commented Apr 23, 2026

@lstein Glad that fixed it. Those comments are now in the code and I've added the test case in test_session_queue_status_sequence.py. I don't think the migration needs a test.

@lstein lstein merged commit 5a0818a into invoke-ai:main Apr 23, 2026
16 checks passed
@JPPhoto JPPhoto deleted the queue-item-status-sequence branch April 23, 2026 15:56
Labels

backend PRs that change backend files frontend PRs that change frontend files python PRs that change python files python-tests PRs that change python tests services PRs that change app services

3 participants